What? Why? How?
27/01/2021
What?
Web Scraping = Web data extraction
Why?
To gather (scrape) data from the internet if:
How?
Let’s try (basic example)
HTML is behind everything on the web.
To parse HTML and scrape the web for the information we need, we only need a brief understanding of its syntax rules, how browsers present it, and its tags and attributes.
A webpage is NOT the same thing as an HTML document: the webpage is what the browser displays, while the HTML document is the plain-text source behind it and can be opened using any text editor.
The HTML code tells a browser how to show a webpage (what goes into a headline, what goes into the body text, etc.). We only need a BRIEF understanding of the underlying marked-up structure to know how to scrape it.
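As a rough illustration of how tags and attributes map to the data we want, here is a minimal sketch in Python using only the standard library's html.parser module. The HTML snippet, the TagCollector class, and the extract function are all made up for this illustration (they are not taken from any real website), and a real scraper would usually download the page first and might use a dedicated parsing library instead.

```python
from html.parser import HTMLParser

# Hypothetical page source, invented for illustration only.
HTML_DOC = """<html>
  <head><title>Example page</title></head>
  <body>
    <h1 class="headline">Welcome</h1>
    <p>Some text with a <a href="https://example.org/about">link</a>.</p>
  </body>
</html>"""


class TagCollector(HTMLParser):
    """Collect the text of <h1> headlines and the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False      # True while we are between <h1> and </h1>
        self.headlines = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs, e.g. [("href", "...")]
        if tag == "h1":
            self.in_h1 = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_h1 = False

    def handle_data(self, data):
        # Text between tags arrives here; keep it only inside a headline.
        if self.in_h1 and data.strip():
            self.headlines.append(data.strip())


def extract(html_text):
    """Return (headlines, links) found in the given HTML text."""
    parser = TagCollector()
    parser.feed(html_text)
    parser.close()
    return parser.headlines, parser.links


headlines, links = extract(HTML_DOC)
print(headlines)
print(links)
```

The headline comes from the text inside a tag (h1), while the link target comes from an attribute (href) of a tag (a). Those are the two places scraped data usually lives.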
Example here with Lancaster University website: